Top-k Correlative Graph Mining

Authors

  • Yiping Ke
  • James Cheng
  • Jeffrey Xu Yu
Abstract

Correlation mining has been widely studied because it can uncover the underlying occurrence dependency between objects. However, correlation mining in graph databases is expensive due to the complexity of graph data. In this paper, we study the problem of mining the top-k correlative subgraphs in a graph database, that is, the subgraphs that share similar occurrence distributions with a given query graph. The search space of the problem is prohibitively large, since every subgraph in the database is a candidate. We propose an efficient algorithm, TopCor, which mines the top-k correlative graphs by exploring only the candidate graphs in the projected database of the query graph. We develop three key techniques for TopCor: an effective correlation checking mechanism, a powerful pruning criterion, and a set of useful rules for candidate exploration. These techniques are very effective in directing the search toward highly correlative candidate graphs. Our experiments confirm the effectiveness of the three techniques and show that TopCor is more than an order of magnitude faster than CGSearch, the state-of-the-art threshold-based correlative graph mining algorithm.
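As a rough illustration of the problem statement only, the sketch below ranks candidate subgraphs by how closely their occurrence sets track those of a query graph. It is not TopCor: the abstract does not specify the correlation measure, so the Pearson (phi) coefficient over binary occurrence indicators is an assumption, and the names phi and top_k_correlative, the representation of occurrences as sets of graph ids, and the co-occurrence filter (a loose stand-in for restricting the search to the projected database) are all hypothetical.

```python
import heapq
from math import sqrt


def phi(occ_q, occ_g, n):
    """Pearson (phi) correlation between two binary occurrence indicators.

    occ_q, occ_g: sets of ids of the database graphs containing each pattern.
    n: total number of graphs in the database.
    Assumption: the abstract does not name the measure; phi is used here
    purely for illustration.
    """
    sq = len(occ_q) / n            # support of the query graph
    sg = len(occ_g) / n            # support of the candidate graph
    joint = len(occ_q & occ_g) / n  # joint support
    denom = sqrt(sq * (1 - sq) * sg * (1 - sg))
    if denom == 0:
        return 0.0                 # degenerate case: a pattern occurs nowhere or everywhere
    return (joint - sq * sg) / denom


def top_k_correlative(query_occ, candidates, n, k):
    """Return the k candidates with the highest phi, best first.

    candidates: dict mapping a (hypothetical) candidate name to its occurrence set.
    Candidates that never co-occur with the query are skipped, since their
    correlation cannot be positive.
    """
    heap = []  # min-heap of (score, name); heap[0] is the weakest of the current top k
    for name, occ in candidates.items():
        if not (occ & query_occ):
            continue
        score = phi(query_occ, occ, n)
        if len(heap) < k:
            heapq.heappush(heap, (score, name))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, name))
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    candidates = {
        "a": {0, 1, 2},
        "b": {0, 1, 3},
        "c": {3, 4, 5},
        "d": {0, 1, 2, 3},
    }
    for score, name in top_k_correlative({0, 1, 2}, candidates, n=6, k=2):
        print(name, round(score, 3))
```

The min-heap keeps only the current best k scores, so the weakest retained score acts as a moving threshold. This is the basic difference from a threshold-based formulation, in which the user has to fix the minimum correlation in advance.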

Similar resources

Mining Top-K Graph Patterns that Jointly Maximize Some Significance Measure

Most graph pattern mining algorithms focus on finding frequent subgraphs and their compact representations, such as closed frequent subgraphs and maximal frequent subgraphs. However, little attention has been paid to mining graph patterns with a user-specified significance measure. In this paper, we study a new problem of mining top-k graph patterns that jointly maximize some significance measur...

Pushing Constraints to Generate Top-K Closed Sequential Graph Patterns

In this paper, the problem of finding sequential patterns in graph databases is investigated. The two main issues addressed are the efficiency and the effectiveness of the mining algorithm. Mining generates a huge volume of sequential patterns, most of which are uninteresting, so users have to go through a large number of patterns to find interesting results. In order to improve th...

TGP: Mining Top-K Frequent Closed Graph Pattern without Minimum Support

In this paper, we propose a new mining task: mining top-k frequent closed graph patterns without minimum support. Most previous work on frequent graph pattern mining requires a minimum support threshold to be specified before mining. However, it is sometimes difficult for users to set a suitable value. We develop an efficient algorithm, called TGP, to mine patterns without minimum supp...

Top-K Correlation Sub-graph Search in Graph Databases

Recently, due to its wide applications, (similar) subgraph search has attracted a lot of attention from the database and data mining communities [13, 18, 19, 5]. In [8], Ke et al. first proposed the correlation sub-graph search problem (CGSearch for short), together with the CGS algorithm, to capture the underlying dependency between subgraphs in a graph database. However, the CGS algorithm requires the speci...

Efficient Mining of Top-k Breaker Emerging Subgraph Patterns from Graph Datasets

This paper introduces a new type of discriminative subgraph pattern, called a breaker emerging subgraph pattern, which is defined through three constraints and two new concepts: base and breaker. A breaker emerging subgraph pattern consists of three subpatterns: a constrained emerging subgraph pattern, a set of bases and a set of breakers. An efficient approach is proposed for the discovery of top-k breaker ...

Publication date: 2009